Leveraging Compounds to Improve Noun Phrase Translation from Chinese and German
Abstract
This paper presents a method to improve the translation of polysemous nouns by leveraging their previous occurrence as the head of a compound noun phrase. First, such occurrences are identified through pattern-matching rules, which detect an XY compound followed closely by a potentially coreferent occurrence of Y, such as “Mooncakes ... cakes ...”. Second, two strategies are proposed to improve the translation of the second occurrence of Y: re-using the cached translation of Y from the XY compound, or post-editing the translation of Y using the head of the translation of XY. Experiments are performed on Chinese-to-English and German-to-French statistical machine translation, with about 250 occurrences of XY/Y, from the WIT3 and Text+Berg corpora. The results and their analysis suggest that while the overall BLEU scores increase only slightly, the translations of the targeted polysemous nouns are significantly improved.
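The detection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' rule set: it assumes pre-tokenized input and treats any token that is a proper suffix of an earlier token as a candidate Y for an XY compound. The function name `find_compound_pairs` and the parameters `window` and `min_head_len` are hypothetical, introduced only for this illustration.

```python
def find_compound_pairs(tokens, window=10, min_head_len=1):
    """Return (i, j) index pairs where tokens[j] is a proper suffix of
    tokens[i] and occurs at most `window` tokens after it.

    Assumption: a plain suffix match stands in for real pattern-matching
    rules, which would likely also check part-of-speech tags.
    """
    pairs = []
    for i, compound in enumerate(tokens):
        last = min(i + 1 + window, len(tokens))
        for j in range(i + 1, last):
            head = tokens[j]
            is_proper_suffix = (
                min_head_len <= len(head) < len(compound)
                and compound.lower().endswith(head.lower())
            )
            if is_proper_suffix:
                pairs.append((i, j))
    return pairs


# German: "Hochhaus" (XY) followed by "Haus" (Y).
german = ["Das", "Hochhaus", "ist", "alt", ".", "Das", "Haus", "steht", "leer"]
german_pairs = find_compound_pairs(german, min_head_len=3)

# Chinese: "月饼" (mooncake, XY) followed by "饼" (cake, Y).
chinese = ["月饼", "很", "甜", "，", "饼", "很", "大"]
chinese_pairs = find_compound_pairs(chinese)
```

Each returned pair marks a position where the translation of the second token could be aligned with the cached translation of the compound's head. A larger `min_head_len` reduces spurious matches on short function words in German, while single-character heads are common in Chinese.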
Similar resources
Feature-Rich Statistical Translation of Noun Phrases
We define noun phrase translation as a subtask of machine translation. This enables us to build a dedicated noun phrase translation subsystem that improves over the currently best general statistical machine translation methods by incorporating special modeling and special features. We achieved 65.5% translation accuracy in a German-English translation task vs. 53.2% with IBM Model 4.
Effects of Noun Phrase Bracketing in Dependency Parsing and Machine Translation
Flat noun phrase structure was, up until recently, the standard in annotation for the Penn Treebanks. With the recent addition of internal noun phrase annotation, dependency parsing and applications down the NLP pipeline are likely affected. Some machine translation systems, such as TectoMT, use deep syntax as a language transfer layer. It is proposed that changes to the noun phrase dependency ...
Bilingually-Constrained Recursive Neural Networks with Syntactic Constraints for Hierarchical Translation Model
Hierarchical phrase-based translation models have advanced statistical machine translation (SMT). Because such models can improve leveraging of syntactic information, two types of methods (leveraging source parsing and leveraging shallow parsing) are applied to introduce syntactic constraints into translation models. In this paper, we propose a bilingually-constrained recursive neural network (...
Investigating Embedded Question Reuse in Question Answering
This paper investigates a novel method in question answering (QA) that enables a QA system to gain performance by reusing information in the answer to one question to answer another related question. Our analysis shows that a pair of questions in general open-domain QA can have an embedding relation through their mentions of noun phrase expressions. We present methods f...
Identifying main obstacles for statistical machine translation of morphologically rich South Slavic languages
The best way to improve a statistical machine translation system is to identify concrete problems causing translation errors and address them. Many of these problems are related to the characteristics of the involved languages and differences between them. This work explores the main obstacles for statistical machine translation systems involving two morphologically rich and under-resourced lan...